(57) Abstract

Fusion genes are constructed between a soluble CD4 (sCD4) gene, or portions thereof, and the genes of a lysosome targeting
domain. Upon the biosynthesis of the fusion proteins in HIV-infected cells, the sCD4 moiety binds newly synthesized gp160
and the lysosomal targeting moieties transport the entire complex to lysosomes. The genes divert HIV coat glycoprotein to
lysosomes for degradation, thus preventing the assembly of new virions and the propagation of HIV. The same process can be used
for the treatment of other retroviruses.
FUSION PROTEINS TARGETED TO LYSOSOMES, FOR THE TREATMENT OF AIDS

Background of the Invention 
In 1981, acquired immune deficiency syndrome (AIDS) was
identified as a disease that severely compromises the human immune
system, and that almost without exception leads to death. In 1983, the
etiological cause of AIDS was determined to be the human
immunodeficiency virus (HIV). In December, 1990, the World Health
Organization estimated that between 8 and 10 million people worldwide
were infected with HIV, and of that number, between 1,000,000 and
1,400,000 were in the U.S.

There are at least two types of HIV, Type I and Type II. Both
preferentially infect T4 helper T lymphocytes and macrophages by
interacting with the molecule CD4 on the surface of the target cell. All
viruses infect cells by the binding of an envelope protein to the cell. In the
case of HIV, the envelope protein is gp120; the cell surface protein is an
antigen called CD4. The viral membrane then apparently fuses with the
cell membrane and the viral genes are injected into the cell, where they
are replicated and new virions assembled using the host replicative
processes. In some cases, the viral DNA must be integrated into the host
genome, where it can remain latent for many years.

This replicative process has made it extremely difficult to treat, or
as importantly, to cure HIV infection. Most attempts to vaccinate people
against the disease have been unsuccessful, and at best would only limit
infection. Most drugs have been targeted to replication of the viral
nucleic acid.

In 1985, it was reported that the synthetic nucleoside 3'-azido-3'-
deoxythymidine (AZT) inhibits the replication of human
immunodeficiency virus type 1. Since then, a number of other synthetic
nucleosides, including 2',3'-dideoxyinosine (DDI), 2',3'-dideoxycytidine
(DDC), 3'-fluoro-3'-deoxythymidine (FLT), 2',3'-dideoxy-2',3'-
didehydrothymidine (D4T), and 3'-azido-2',3'-dideoxyuridine (AZDU),
have been demonstrated to be effective against HIV, although none is
able to cure the disease or do more than prolong the life expectancy of
the infected individuals.

The only drug commercially available for the treatment of AIDS,
3'-azido-3'-deoxythymidine, is a potent inhibitor of HIV reverse
transcriptase. However, the benefits of AZT must be weighed against
the severe adverse reactions of bone marrow suppression, nausea,
myalgia, insomnia, severe headaches, anemia, peripheral neuropathy, and
seizures. These adverse side effects often occur immediately after
treatment begins, even though a minimum of six weeks of therapy is
necessary to realize AZT's benefits. DDI, which has recently been
approved by the FDA for clinical testing for the treatment of AIDS,
is also associated with a number of side effects, including sporadic
pancreatitis and peripheral neuropathy.

It is therefore apparent that there remains an important need to
develop alternative therapies for treatment of HIV infections.

Gene therapy to achieve "intracellular immunization", as described
by Baltimore, D. Nature 335: 395-396 (1988), against AIDS and other
viral infections, especially infections with retroviruses, offers definitive
advantages because its successful application can potentially provide the
patients with an intrinsic means to control the disease. To develop gene
therapies for AIDS, it is equally important to generate new genes which
can be used in AIDS therapy and to have the technology for gene
transfer into the patients. Rapid advances in the technology for efficient
gene transfer in vitro and in vivo have been made recently, for example,
as reported by Friedmann, T. Science 244: 1275-1281 (1989). The
development of gene transfer vehicles, such as viral vectors, has led to
clinical experimentation with gene therapy.

It is expected that the advances in gene therapy technology will
continue. However, there are very few genes which have been
demonstrated to be effective against HIV and potentially useful for gene
therapy against AIDS. HIV glycoprotein gp160 (precursor of gp120 and
gp41) is synthesized on polysomes and transported through the secretory
pathway [endoplasmic reticulum (ER), Golgi, and secretory vesicles] to
the cell surface where the assembly of new virions takes place. The
strong binding of viral gp120 to cell surface receptor CD4 is the primary
route of HIV invasion of human cells (Klatzmann, D., et al., Science 225:
59-63 (1984); Sattentau, Q.J., et al., Science 234: 1120-1123 (1986);
Deen, K.C., et al., Nature 331: 82-84 (1988); Traunecker, A., et al.,
Nature 331: 84-86 (1988); Dalgleish, A.G., et al., Nature 312: 763-766
(1984); Maddon, P.J., et al., Cell 47: 333-348 (1986); Ho, D.D., et al.,
J. Clin. Invest. 77: 1712-1715 (1986); Gartner, S., et al., Science 233:
215-219 (1986)). The newly synthesized gp160 in HIV-infected T
lymphocyte cells can bind the newly synthesized CD4 molecules in the
ER (Hoxie, J.A., et al., Science 234: 1123-1127 (1986); and Kawamura,
I., et al., J. Virol. 63: 3748-3754 (1989)).

Buonocore and Rose constructed a modified sCD4 with an
addition to the C-terminus of a 6-residue sequence (Sequence ID No. 5),
SEKDEL, which is the signal for ER retention (Munro, S., and Pelham,
H.R.B. Cell 48: 899-907 (1987)). The modified CD4, sCD4-KDEL,
stayed in the ER and prevented the newly synthesized gp160 from
reaching the cell surface (Buonocore, L., and Rose, J.K. Nature 345:
625-628 (1990)). The ER residence of sCD4-KDEL has a clear
limitation as a therapeutic agent, however. Proteins which reside in the
ER, such as BiP, are transferred along with the newly synthesized secretory
proteins to the salvage compartment, where the ER resident proteins are
sorted and returned to the ER (Pelham, H.R.B. Ann. Rev. Cell Biol. 5: 1-23
(1989)). For sCD4-KDEL to be effective as an anti-HIV agent, it
must be synthesized continuously and at a level higher than that for
gp160. This means that the continuing synthesis of sCD4-KDEL will
ultimately exceed the capacity of the sorting mechanism in the salvage
compartment. After that point, the newly synthesized sCD4-KDEL and
gp160 will be lost from the ER, rendering sCD4-KDEL ineffective
against HIV. It may also cause the loss of native ER resident proteins.
Moreover, sCD4-KDEL was not shown to resist HIV infection or
propagation.

It is therefore an object of the present invention to provide a gene
therapy for use in treating or preventing AIDS and other retroviral
diseases.



Abstract of the Invention

This invention consists of the design and demonstration of fusion
genes which can be used in gene therapy for treating acquired
immunodeficiency syndrome (AIDS) and other retroviral infections. The
principle of the therapeutic function is that upon transfer of these genes
into human cells, the genes direct the synthesis of fusion proteins which
interfere with the normal function of human immunodeficiency virus
(HIV), the causative agent of AIDS. The therapeutic genes are fusions
between the genes encoding soluble CD4 (sCD4) (or other protein
required for binding of virus to the target cell) and a lysosome targeting
protein domain. Results have shown that when the fusion genes are
expressed, the sCD4 moiety binds HIV glycoprotein gp160 in the
endoplasmic reticulum while the lysosome targeting moiety transports the
entire complex to the lysosomes for degradation. Thus the therapeutic
genes prevent gp160 from reaching the cell surface, stopping the
assembly of new virions and the propagation of HIV. The lysosome
targeting domains successfully used in the fusion genes are human
procathepsin D (PCaD) and parts of human lysosomal membrane proteins
lamp-1, lamp-2, and acid phosphatase.
The experimental evidence that established the predicted function
of the new genes includes the following: (a) the transfection of the gp160
gene into HeLa cells resulted in finding gp160 protein on the cell
surface, whereas the transfection of one of the therapeutic genes with the gp160
gene into HeLa cells stopped gp160 from reaching the cell surface; (b) in the
presence of one of the therapeutic genes in HeLa cells, gp160 protein is
degraded more rapidly than in the absence of the therapeutic gene; (c)
the newly synthesized therapeutic proteins in HeLa cells are cleaved and
digested in a manner characteristic of lysosomal activity; (d) transfecting
one of the therapeutic genes into the cultured T-lymphocyte cell line CEM
inhibited the propagation of HIV.

Brief Description of the Drawings 
Figure 1 is a schematic presentation of the overall strategy of the
synthesis and secretion of proteins in the endoplasmic reticulum (ER) and
Golgi. In the ER, gp160 of HIV is synthesized and glycosylated, and a
fusion protein of soluble CD4-procathepsin D (sCD4-PCaD) is expressed
from a cloned gene. The procathepsin moiety of the sCD4-PCaD is
N-glycosylated and mannose phosphorylated since it is a lysosomal enzyme.
Then the gp160 binds to the sCD4 moiety of sCD4-PCaD. The complex
is transported through the cis-Golgi to the trans-Golgi. In the trans-Golgi
network, the mannose-6-phosphate (Man-PO4) receptors bind the mannose-6-
phosphate of the PCaD moiety and target the whole complex, including
gp160, to the lysosomes. In the lysosomes, procathepsin D is activated to
cathepsin D, and the gp160 and sCD4 moieties are proteolytically degraded.
This strategy is designed to prevent gp160 from entering the secretory
pathway to reach the cell surface, thus preventing the assembly of HIV.

Figure 2 is a schematic of the construction of sCD4-PCaD. The
PCR primers used are:

P-1, 5'-GAATTCAAGCCCAGAGCCCTGCC-3' (Sequence ID No. 6)

P-2, 5'-TCTAGAGGCCATTGGCTGCACCG-3' (Sequence ID No. 7)

P-3, 5'-TCTAGACTCGTCAGGATCCCGCTG-3' (Sequence ID No. 8)

P-4, 5'-GTCGACCTAGAGGCGGGCAGCC-3' (Sequence ID No. 9)

Figure 3 is a schematic of the construction of sCD4-HAP. The
PCR primers used are:

P-1 = 5'-TCTAGACAGCTGGCAAGCGGTCCTG-3' (Sequence ID No. 10)

P-2 = 5'-GTCGACTCAGGCGTGGTCCTCCCC-3' (Sequence ID No. 11)

Figure 4 is a schematic of the construction of sCD4-L1. The
PCR primers used are:

P-1 = 5'-TCTAGACTGCTGGACGAGAACAGCAC-3' (Sequence ID No. 12)

P-2 = 5'-GTCGACACCAGGCTAGATAGTCTGGTAG-3' (Sequence ID No. 13)

Figure 5 is a schematic of the construction of sCD4-L2. The
PCR primers used are:

P-1 = 5'-TCTAGAAGTGCAGATGACGACAACTTC-3' (Sequence ID No. 14)

P-2 = 5'-GTCGACCTAAAATTGCTCATATCCAGCATG-3' (Sequence ID No. 15)
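
Each primer listed for Figures 2-5 begins with a six-base engineered restriction site. As a minimal sketch, not part of the original disclosure, the following Python checks the 5' site of each Figure 2 primer and reverse-complements a primer so that it can be compared with the coding strand in the sequence listings. The enzyme names EcoRI, XbaI and SalI are the standard names for the recognition sequences GAATTC, TCTAGA and GTCGAC; the text itself does not name the enzymes, and the function names are illustrative assumptions.

```python
# Standard recognition sequences. The enzyme names are an assumption added
# for illustration; the text only calls these "engineered restriction sites".
RESTRICTION_SITES = {
    "GAATTC": "EcoRI",
    "TCTAGA": "XbaI",
    "GTCGAC": "SalI",
}

# Primers for the sCD4-PCaD construct (Figure 2, Sequence ID Nos. 6-9).
PRIMERS = {
    "P-1": "GAATTCAAGCCCAGAGCCCTGCC",
    "P-2": "TCTAGAGGCCATTGGCTGCACCG",
    "P-3": "TCTAGACTCGTCAGGATCCCGCTG",
    "P-4": "GTCGACCTAGAGGCGGGCAGCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def five_prime_site(primer):
    """Return the enzyme whose recognition site starts the primer, or None."""
    return RESTRICTION_SITES.get(primer[:6].upper())


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, five_prime_site(seq), reverse_complement(seq))
```

Reverse-complementing P-2 gives CGGTGCAGCCAATGGCCTCTAGA, which reads into the TCTAGA junction that follows the sCD4 segment in the sequence listings.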

Figure 6 is a graph of reverse transcriptase (millions dpm/ml)
versus time for blank vector pRc/RSV (1, -X-); no DNA (2, squares);
pRc/RSV - sCD4-PCaD (3, inverted triangle); pRc/RSV - sCD4-HAP (4,
-#-); pRc/RSV - sCD4-L2 (5, triangle); and pRc/RSV - sCD4-L2 (6,
-diamond-).
Figures 7A and B are graphs of β-hexosaminidase and density
(g/ml) versus gradient fraction for Percoll gradients of lysosomes,
showing the distribution of gp160 in fractions from Percoll density
gradient centrifugation in the presence and absence of sCD4-PCaD gene
expression. Figure 7A shows the β-hexosaminidase activity (solid line),
density (broken line), and the autoradiography of the gel electrophoresis
from cells transfected with gp160 alone. Figure 7B shows the same data
from cells cotransfected with gp160 and sCD4-PCaD genes.

Detailed Description of the Invention

A method for treating retroviral infections, especially human
immunodeficiency virus (HIV), is described wherein a fusion protein is created that
binds to the viral envelope protein as it is formed and transports the
envelope protein to lysosomes where it is degraded. The result is that
the ability of the virus to replicate within the cell producing the fusion
protein is limited. The fusion protein consists of two components: the
protein which binds to the viral envelope protein and a protein (or
domain of the protein) which targets the fusion protein to a lysosome. In
the preferred embodiment for treating HIV, the first protein is soluble
CD4, which binds HIV glycoprotein gp160, and the second protein is
procathepsin D (PCaD), parts of human lysosomal membrane proteins
lamp-1, lamp-2, or acid phosphatase. In the most preferred embodiment,
the fusion protein is sCD4-L1 or sCD4-L2.

The teachings of the following references cited herein are
specifically incorporated herein as exemplifying methods or reagents
useful in constructing and using the fusion proteins for treatment of viral
disorders.

The sCD4 fusion proteins, which bind to gp160 and sort as a
complex to lysosomes for degradation, have an advantage over the ER
retention shown by the modified sCD4 of Buonocore and Rose, which
carries at its C-terminus the 6-residue sequence SEKDEL (Sequence ID
No. 5), the signal for ER retention: the fusion proteins continually
remove bound gp160 from the ER/Golgi system.

sCD4 in the fusion gene can be substituted by its parts, domain
D1 or combined domains D1-D2. sCD4, the extracellular segment of
CD4, consists of four tandem immunoglobulin-like domains. The
N-terminal domain of sCD4, D1, by itself binds gp120 with high affinity,
as reported by Arthos, J., et al., Cell 57, 469-481 (1989). Active
recombinant D1 and D1-D2 domains have been obtained (Arthos, et
al.; Chao, B.H., et al., J. Biol. Chem. 264, 5812-5816 (1989);
Traunecker, A., et al., Nature 331, 84-86 (1988); and Berger, E., et al.,
Proc. Natl. Acad. Sci. USA 85: 2357-2361 (1988)), indicating that these
domains are capable of independent folding.

Two types of lysosomal targeting components are particularly
suited for fusion to sCD4. The first is a lysosomal proenzyme which
contains a structural marker for lysosomal targeting. Procathepsin D
(PCaD) was chosen as the lysosome targeting domain of the prototype
therapeutic gene for the reason that much is known of its structure and
function relationships (Tang, J. and Wong, R.N.S. J. Cell. Biochem. 33:
53-63 (1987); Takahashi, T., et al., J. Biol. Chem. 258: 2819-2830
(1983); Shewale, J.G., and Tang, J. Proc. Natl. Acad. Sci. USA 80:
3703-3707 (1984); Yonezawa, S., et al., J. Biol. Chem. 263: 16504-16511
(1988); Faust, P.L., et al., Proc. Natl. Acad. Sci. USA 82: 4910-4914
(1985)), its sorting mechanism via mannose-6-phosphate receptors
in the trans-Golgi network (Kornfeld, S. and Mellman, I. Ann. Rev. Cell
Biol. 5: 483-525 (1989)), and the spontaneous activation of its precursor,
procathepsin D, in the lysosomes (Hasilik, A., et al., Eur. J. Biochem.
125: 317-321 (1982)).

The second type of lysosomal targeting domain for sCD4 fusion is
taken from part of the lysosomal membrane proteins. Three human
lysosomal membrane proteins, lamp-1 (L1), lamp-2 (L2), and lysosomal
acid phosphatase (HAP) (Fukuda, M., et al., J. Biol. Chem. 263: 18920-18928
(1988); Pohlmann, R., et al., EMBO J. 7: 2343-2350 (1988); and
Waheed, A., et al., ibid. 7: 2351-2358 (1988)), each contain in the
C-terminus a "lysosomal targeting signal" (LTS) region which consists of a
short membrane-anchoring sequence and a short cytosolic domain
(Kornfeld, S. and Mellman, I. Ann. Rev. Cell Biol. 5: 483-525 (1989);
Peters, C., et al., EMBO J. 9: 3497-3506 (1990); Williams, M.A. and
Fukuda, M. J. Cell Biol. 111: 955-966 (1990)). These three sCD4-LTS
fusion genes, sCD4-HAP, sCD4-L1, and sCD4-L2, form the second
group of therapeutic genes. The basic function of the sCD4-LTS
therapeutic genes is the same as that of sCD4-PCaD, even though the
targeting mechanisms are different between these two groups.

The PCaD moiety can be substituted for by other soluble
lysosomal enzymes, such as other lysosomal soluble proteins containing
mannose-6-phosphate markers. Some examples of human lysosomal
enzymes with known cDNA structures and their cDNA sizes are
α-N-Acetylgalactosaminidase, 1.3 kbp (Wang, A.M., et al., J. Biol. Chem.
265: 21859-21866 (1990)); Glycosylasparaginase, 1.1 kbp (Fisher, K.J.,
et al., FEBS Lett. 276: 440-444 (1990)); Glucocerebrosidase, 1.8 kbp
(Tsuji, S., et al., J. Biol. Chem. 261: 50-53 (1986)); Procathepsin L, 1.1
kbp (Gal, S. and Gottesman, M.M. Biochem. J. 253: 303-306 (1988));
Procathepsin B, 1.1 kbp (Chan, S.J., et al., Proc. Natl. Acad. Sci. USA
83: 7721-7725 (1986)); and Procathepsin E, 1.2 kbp (Azuma, T., et al.,
J. Biol. Chem. 264: 16748-16753 (1986)). The lysosomal targeting
domains (the transmembrane domains and cytosolic domains) of other
lysosomal membrane proteins (of human or other species origin) can also
be substituted for human HAP, L1 and L2 in the therapeutic fusion
genes.

It is believed that it will be possible to use more than one
lysosome targeting domain in a therapeutic gene. Multiple targeting
domains can increase the efficiency and capacity of gp160
transport to lysosomes. Examples of multidomain therapeutic genes
are sCD4-PCaD-HAP, sCD4-PCaD-PCaD, and other combinations. The
linker peptide between the sCD4 and the lysosome targeting domain can
also be altered in amino acid sequence and in length to achieve different
efficiencies and capacities of gp160 transport to lysosomes.
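
The modular design described above (a gp120-binding module joined to one or more lysosome targeting domains) can be sketched as a simple enumeration. This is only an illustrative naming exercise under stated assumptions: the module lists come from the text, but names for constructs built on D1 or D1-D2 (for example "D1-PCaD") and the function `fusion_names` are hypothetical, and nothing here implies that every combination was made or tested.

```python
from itertools import product

# Binding modules named in the text: full sCD4, or its N-terminal parts.
BINDING_MODULES = ["sCD4", "D1", "D1-D2"]

# Lysosome targeting domains named in the text.
TARGETING_DOMAINS = ["PCaD", "HAP", "L1", "L2"]


def fusion_names(max_targeting_domains=1):
    """List hypothetical construct names: one binding module followed by
    1..max_targeting_domains targeting domains (repeats allowed)."""
    names = []
    for binder in BINDING_MODULES:
        for n in range(1, max_targeting_domains + 1):
            for combo in product(TARGETING_DOMAINS, repeat=n):
                names.append("-".join((binder,) + combo))
    return names


if __name__ == "__main__":
    single = fusion_names()
    print(len(single), "single-domain designs, e.g.", single[:4])
    double = fusion_names(max_targeting_domains=2)
    print("sCD4-PCaD-HAP" in double, "sCD4-PCaD-PCaD" in double)
```

With one targeting domain the sketch lists the four constructs built in Example 1 plus their hypothetical D1 and D1-D2 counterparts; allowing two domains adds the multidomain examples named in the text.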

The gene encoding the fusion protein is constructed using standard
genetic engineering techniques, as described in detail below for treatment
of HIV. The fusion gene is then introduced into cells in vitro using
methods such as calcium phosphate coprecipitation, lipofection
(liposomes), cell fusion, electroporation, or a vector such as a vaccinia
virus. For example, CEM-SS cells were grown to exponential phase.
Plasmids containing 10 µg each of the sCD4-LTD genes were separately
transformed into 2 x 10' cells by electroporation using the method of
Aldovini, A. and M.B. Feinberg, Techniques in HIV Research, eds.
Aldovini, A. and B.D. Walker, pp. 147-176 (Stockton Press, NY 1990).

For in vivo applications, the preferred method is by transfection
using a vector system such as the vaccinia virus. The vector is not
limited; however, for efficient expression, mammalian or viral promoters
and other regulatory elements must be present. A recent account of the
progress in gene therapy of humans is described in Friedmann, T.,
Science 244: 1275-1281 (1989). Some vector systems for introduction of
therapeutic genes into AIDS patients have been described by Shimada,
T., et al., J. Clin. Invest. 88: 1043-1047.

The following examples demonstrate the effectiveness of the
method for treating HIV infection. However, the same strategy could be
used with other retroviral infections, such as hepatitis B and HTLV-1
caused leukemia. In these diseases, the cell surface protein receptors
which bind virus glycoprotein can be used in fusion with a lysosome
targeting protein.

The present invention will be further understood by reference to
the following non-limiting examples of the construction and cloning of
therapeutic fusion genes, and their efficacy in degrading HIV envelope
protein as it is formed.

Example 1: Construction and Cloning of the Therapeutic Genes. 
Four sCD4-fusion genes, sCD4-PCaD, sCD4-HAP, sCD4-L1, and
sCD4-L2, were constructed as shown in the following sequences.
Schematics of their constructions are shown in Figures 2-5, respectively.
The gene encoding human procathepsin D cDNA was reported by Faust,
et al., (1985). The genes encoding human lamp-1 and lamp-2 were
reported by Fukuda, et al., (1988). The gene encoding human HAP was
reported by Pohlmann, et al., (1988).

The nucleotide sequence encoding sCD4-PCaD is shown below in 
Sequence ID No. 1. Underlined letters are engineered restriction sites. 



GAATTCA AGC OCaOAOCCCT GCCATTTCTG 
CCTCCCTCGG CAA6GCCACA ATGAACOGGG 
TGCAACTGGC GCTCCTCCCA GCAGCCACTC 
G6GATACAGT G6AACTGACC TGTACAGCTT 
AAAACTCCAA CCAGATAAAG ATTCTGGGAA 
CCAAGCTGAA TCAXCGCGCT 6ACTCAA6AA 
TGATCATCAA GAATCTTAA6 ATAGAAGACT 
A6AA6GAGGA 6GTGCAATTG CTAGTGTTCG 
TTCAGG6GCA GAGCCTGACC CTGAOCTTGG 
AATGTA6GAG TCCAA6GGGT AAAAACATAC 
TGGAGCTCCA 6GATAGTGGC ACCTG6ACAT 
A6TTCAAAAT AGACATCGTG 6TGCTA6CTT 
AAGA66GGGA ACAGGIGGAG TTCTCCTTCC 
6CAGT6GCGA 6CT6TG6TG6 CAGGCGGAGA 
TTGAOCTGAA 6AACAAGGAA. 6TGTCTGXAA 



TGGGCXCAGG 


TCCCTACTGC 


TCAGCCCCTT 


GAGTCCCTTT 


TAGGCACTTG 


CTTCTG6TGC 


AGGGAAAGAA 


AGTGGTGCTG 


GGCAAAAAAG 


CCCAGAAGAA 


GAGCATACAA 


TTCCACTGGA 


ATCAGGGCTC 


CTTCTTAACT 


AAAGGTCCAT 


OAA6CCTTTG 


GGACCAAGGA 


AACTTCCCCC 


CAGATACTTA 


CATCTGT6AA 


6T6GAGGACC 


6ATTGACTGC 


CAACTCTGAC 


ACCCACCTGC 


A6AGCCCCCC 


TGGTAGTAGC 


CCCTCAGTGC 


AGGGGGGGAA 


GACCCTCTCC 


6TGTCTCAGC 


GCACTGTCTT 


GCA6AACCA6 


AAGAAGGTG6 


TCCAGAAGGC 


CTCCAGCATA 


6TCTATAAGA 


CACT06CCTT 


TACAGTTGAA 


AAGCTGACGG 


G6GCTTCCTC 


CTCCAAGTCT 


TGGATCACCT 


AACGGGTTAC 


CCA6GACCCT 


AAGCTCCAGA 
GCTCCCGCTC


CACCTCACCC 


TGCCCCAGGC 


CTTGCX^CAG 


TAT6CTGGCT 




W»>W%»*^* 


CXT6AAGCGA 


AAACAGGAAA 


GTTGCATCAG 


GAAGTGAACC 




GAGA6CCAGT 


CAGCTCCAGA 


AAAATTTGAC 


CTGTGAGGTG 


TGGGGACCCA 




CCTGAT6CT6 


AGCTTGAAAC 


TGGA6AACAA 


GGAGGCAAAG 


GTCTCGAAGC 


a ^ a Afl/Sf* 




CTGAACCCTG 


AGGCGGGGAT 


GTGGCAGTGT 


CTGCTGAGTG 






GAATCCAACA 


TCAAGGTTCT 


GCCCACATGG 


TCCACCCCGG 


TCKZAw wwAa X 




CTCGTCAGGA 


TCCCGCTGCA 


CAAGTTCACG 


TCCATCCGCC 






GGCTCTGTGG 


AGGACCTGAT 


TGCCAAA6GC 


CCCGTCICAA 


AGTACTCCCA 




GCCGTGACSOG 


AGGGGCCCAT 


TCCCGA6GTG 


CTCAAGAACT 


AGATGGaCuw 


WWAvXAwXAW 


GGGGAGATTG 


GCATCGGGAC 


6CCCCCCCAG 


TGCTTCACAG 


TCGXCTTC6A 




TCCAACCXGT 


GGGTCCCCTC 


CATCCACTGC 


AAACXGCTGG 


ACATCGCTTG 


r'»PC#*lkTiff*AC 
wXwwAXA^wnU' 


CACAAGTACA 


ACAGCGACAA 


GTCCAGCACC 


TACGTGAAGA 


AT6GTACCTC 


^fpipi|V2|VPATP 
X xvA^«AXO 


CACTATGGCT 


C6G6GAGCCT 


CTCCGGGTAC 


CTGAGCCAGG 


AGACX6TGTC 


GGTvCCCXvU 


f*ACS'PCACCGT 


CGTCAGCCTC 


TGCCCT6GGC 


GGTGTCAAAG 


TG6A0AGGCA 


GGTCTTT6GG 


6AG6CCACCA 


AGCAGCCAGG 


CATCACCTTC 


ATCGCAGCCA 


AGTTC6ATGG 


CnxwCxXnMl.* 


A'FGGCCTACC 


CCC6CATCTC 


CGTCAACAAC 


GTGCTGCCCG 


TCTTC6ACAA 


CCJaj'aXvwA17 


CAGAAGCT6G 


TGGACCAGAA 


CATCTTCTCC 


TTCTACCTGA 


GCAGGGACCU 




CCTGGG6GTG 


AGCTGATGCT 


GGGTGGCACA 


GACTCCAAGT 


ATTACAA6GG 


flfi flft Ki^^^ns^'f 
XxCXwXvXi^^ 


*P A f P TG AATG 


TCACCCGCAA 


GGCCTACTGG 


CAGGTCCACC 


TGGACACAGG 


CACTTCCCTC 


ATGGT6GGCC 


CG6TGGATGA 


6GTGCGCGAG 


CTGCAGAAGG 


CCATOGGGGC 


CGTGCCGCT6 


ATTCAGGGCG 


AGTACATGAT 


CCCCTGTGAG 


AAGGTGTCCA 


CCCTGCCCGC 


GATCACACTG 


AA6CTGGGAG 


6CAAAGGCTA 


CAAGCTGTCC 


CCAGA6GACT 


ACAC6CTCAA 


GGTGT06CAG 


6CCGGGAAGA 


CCCTCT60CT 


6AGCGGCTTC 


ATGGGCATGG 


AGATCCCGCC 


ACCAGC6GG 


CCACTCXGGA 


TCCTGGGOGA 


CGTCTTCATC 


GGCCGCTACT 


ACACTGTGXT 


TGACCCT6AC 


2ACAACAG6G 


T6G6CTT0GC 


CGAGGCTGCC 


C6CCTCTAGC 



AGCTB 

The nucleotide sequence encoding sCD4-HAP is shown below in
Sequence ID No. 2. Underlined letters are engineered restriction sites.



GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC
AATGTAGGAG TCCAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCCAGCATA GTCTATAAGA
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGCATCAG GAAGTGAACC
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG
ACTCGGGACA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG
TGCAGCCAAT GGCCTCTAGA CAGCTGGCAA GCGGTCCTGC AGACACAGAG GTGATTGTGG
CCTTGGCTGT ATGTGGCTCC ATCCTCTTCC TCCTCATAGT GCTGCTCCTC ACCGTCCTCT
TCCGGATGCA GGCCCAGCCT CCTGGCTACC GCCACGTCGC AGATGGGGAG GAGCACGGCT
GAGTCGAC













The nucleotide sequence encoding sCD4-L1 is shown below in
Sequence ID No. 3. Underlined letters are engineered restriction sites.



GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC
AATGTAGGAG TCCAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCCAGCATA GTCTATAAGA
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGCATCAG GAAGTGAACC
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG
ACTCGGGACA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG
TGCAGCCAAT GGCCTCTAGA CTGCTGGACG AGAACAGCAC GCTGATCCCC ATCGCTGTGG
GTGGTGCCCT GGCGGGGCTG GTCCTCATCG TCCTCATCGC CTACCTCGTC GGCAGGAAGA
GGAGTCACGC AGGCTACCAG ACTATCTAGC CTGGTGTCGA C

The nucleotide sequence encoding sCD4-L2 is shown below in
Sequence ID No. 4. Underlined letters are engineered restriction sites.



GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC
AATGTAGGAG TCCAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCCAGCATA GTCTATAAGA
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGCATCAG GAAGTGAACC
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG
ACTCGGGACA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG
TGCAGCCAAT GGCCTCTAGA AGTGCAGATG ACGACAACTT CCTTGTGCCC ATAGCGGTGG
GAGCTGCCTT GGCAGGAGTA CTTATTCTAG TGTTGCTGGC TTATTTTATT GGTCTCAAGC
ACCATCATGC TGGATATGAG CAATTTTAGG TCGAC

Example 2: Demonstration of efficacy of the Therapeutic Genes: 

I. HeLa-CD4+ Cells Expressing Gp160 Formed Syncytia,
which was Reversed by Coexpression of Therapeutic
Genes.

HeLa-CD4+ cells are HeLa cells which express CD4 on the cell
surface. When the gp160 gene is transfected into HeLa-CD4+ cells, the
cells have both CD4 and gp160 on the cell surface. Because CD4 binds
gp160 tightly, the cells aggregate and fuse together to form giant cell
masses called syncytia. When gp160 and a therapeutic gene are both
transfected into the cells, even though the gp160 is synthesized, it does
not appear on the cell surface because the protein made from the
therapeutic gene transports the gp160 to the lysosomes instead of the cell
surface, so no syncytium is formed.

Syncytium formation was monitored for HeLa-CD4+ cells (clone
HT4-6C) transfected with gp160 or cotransfected with gp160 and other
genes, using a vaccinia virus expression system. The following results
were obtained.



Untransfected cells                                  No syncytium

Cells transfected with procathepsin D (control)      No syncytium

Cells transfected with gp160 gene                    Syncytia

Cells cotransfected with gp160 and sCD4-PCaD gene
in the wrong cloning direction                       Syncytia

Cells cotransfected with gp160 and
blank pET-3a vector                                  Syncytia

Cells cotransfected with gp160 and
therapeutic gene sCD4-PCaD (right direction)         No syncytium

Cells cotransfected with gp160 and
therapeutic gene sCD4-HAP                            No syncytium

Cells cotransfected with gp160 and sCD4              Syncytia

Cells cotransfected with gp160 and sCD4-L1           No syncytium

Cells cotransfected with gp160 and sCD4-L2           No syncytium

These results were completely reproducible in three experiments
using HeLa-CD4+ clone 6C and in one experiment with clone 1022.
These results indicate that neither sCD4 nor the control vectors could
reverse the syncytium formation. Only the therapeutic genes sCD4-PCaD
and sCD4-HAP can reverse the syncytium formation by gp160.
The expression of the lysosome targeting domain with the sCD4 is
necessary to prevent the syncytium.
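
The pattern in the results table can be restated as a simple rule: syncytia appear whenever gp160 is expressed, unless a correctly oriented sCD4 fusion carrying a lysosome targeting domain is coexpressed. The following Python sketch encodes the ten observations as data and checks them against that rule. The tuple layout, the construct labels, and the function name `predicts_syncytia` are illustrative choices, not part of the original disclosure.

```python
# Observations transcribed from the syncytium results table.
# Each entry: (description, gp160 expressed, coexpressed construct, syncytia seen)
OBSERVATIONS = [
    ("untransfected",                        False, None,                 False),
    ("procathepsin D (control)",             False, "PCaD",               False),
    ("gp160 alone",                          True,  None,                 True),
    ("gp160 + sCD4-PCaD, wrong direction",   True,  "sCD4-PCaD-reversed", True),
    ("gp160 + blank pET-3a vector",          True,  "blank-pET-3a",       True),
    ("gp160 + sCD4-PCaD, right direction",   True,  "sCD4-PCaD",          False),
    ("gp160 + sCD4-HAP",                     True,  "sCD4-HAP",           False),
    ("gp160 + sCD4",                         True,  "sCD4",               True),
    ("gp160 + sCD4-L1",                      True,  "sCD4-L1",            False),
    ("gp160 + sCD4-L2",                      True,  "sCD4-L2",            False),
]

# Constructs that join sCD4 to a lysosome targeting domain in the
# correct orientation (labels are illustrative).
TARGETED_FUSIONS = {"sCD4-PCaD", "sCD4-HAP", "sCD4-L1", "sCD4-L2"}


def predicts_syncytia(gp160_expressed, construct):
    """Rule: gp160 causes syncytia unless a targeted sCD4 fusion is present."""
    return gp160_expressed and construct not in TARGETED_FUSIONS


# Collect any observation that the rule fails to reproduce.
mismatches = [
    description
    for description, gp160, construct, observed in OBSERVATIONS
    if predicts_syncytia(gp160, construct) != observed
]

print("rule matches all observations" if not mismatches else mismatches)
```

Note that sCD4 alone falls on the "syncytia" side of the rule, which is the basis for the statement that the targeting domain is required.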

II. gp160 and the Therapeutic Gene Products (sCD4-PCaD
and sCD4-HAP proteins) are Degraded in the



The following reagents used in these studies were obtained
through the AIDS Research and Reference Program, AIDS Program,
NIAID, NIH: plasmid pIIIenv3-1 (reagent #289, from Dr. Joseph
Sodroski), plasmid pT4B (#157, from Dr. Richard Axel), antiserum to
HIV-1 gp120 (#288, from Dr. Michael Phelan), antiserum to CD4 (#314,
from Dr. Michael Phelan), vector vTF7-3 (#356, from Drs. Tom Fuerst
and Bernard Moss), HeLa cells (#153, from Dr. Richard Axel), HeLa-CD4+
(clone 6C) (#459, from Dr. Bruce Chesebro), HeLa-CD4+ (clone
1022) (#1109, from Dr. Bruce Chesebro), and CEM-SS cells (#776, from
Dr. Peter L. Nara). Rabbit anticathepsin D antiserum was produced as
described by Huang, J.S. et al., J. Biol. Chem. 254, 11405-11417
(1979).
HeLa cells were transfected with gp160 gene or cotransfected with
gp160 and one of the therapeutic genes (sCD4-PCaD or sCD4-HAP),
using a vaccinia expression system for transfection and expression. The
proteins were metabolically labelled with radioactive methionine (pulse).
At different times (chase), the cells were separated from the culture
medium and homogenized. Immunoprecipitations were performed on cell
homogenates and media with antibodies against gp160, CD4, and PCaD,
respectively. The precipitated proteins were separated by SDS-gel
electrophoresis and their patterns visualized by exposure to photographic
films.

The patterns of the specific proteins indicate the fates of
gp160 and the therapeutic proteins. The results from the pulse-chase
experiments indicate that gp160 is not degraded when the gene for gp160
is transfected alone. When gp160 and the therapeutic genes are
cotransfected, both proteins are rapidly degraded. The results are
summarized below.

The pattern of protein bands immunoprecipitated with gp160
antibody after a 4 h chase, when gp160 was transfected alone, showed
that both gp160 and gp120 bands were associated with the cells, but only
gp120 was in the medium, due to shedding from the cell surface.
Cotransfection of gp160 and sCD4 did not significantly change the
pattern from that of gp160 alone. When gp160 was cotransfected with
either the sCD4-PCaD gene (sCD4-P) or the sCD4-HAP gene (sCD4-H), no
gp120 was observed either in the cells or in the media, indicating that
gp160 did not enter the pathway to the cell surface, which also
processes gp160 to gp120 and gp41. A band was observed near 96 kD in the
cells with sCD4-PCaD cotransfection. This band represents metabolically
radiolabeled sCD4-PCaD, which binds strongly to gp160 and is
coprecipitated by antibody against gp160. Similarly, a 50-kD band from
the cells cotransfected with the sCD4-HAP gene represents the fusion
protein sCD4-HAP. These results have been completely reproducible in
several experiments.

The results demonstrate that sCD4-fusion genes cause gp160 to go
to a different transporting pathway. Strong binding between
sCD4-fusion proteins and gp160 takes place in the cells.

The patterns from immunoprecipitation of cell homogenates and
media with antibodies against gp160, obtained by SDS-gel electrophoresis,
also demonstrate that gp160 is degraded more rapidly in the presence of
the sCD4-PCaD gene. After a 4 h chase, the combined amount of gp160
and gp120 in the cells transfected with only the gp160 gene already far
exceeds the gp160 band from the cells cotransfected with gp160 and
sCD4-PCaD genes.

After an 18 h chase, the amount of combined gp160 and gp120
(gp160 transfection) did not appreciably change from the 4 h result.
However, the gp160 band is largely diminished in the cells cotransfected
with the sCD4-PCaD fusion gene. (The intensity of this gp160 band is
only about 1/8 of that of the combined gp160 + gp120.) These results
indicate that gp160 in the cells synthesizing sCD4-PCaD is rapidly
degraded as a result of the expression of the sCD4-PCaD gene in the
cells. This conclusion is consistent with the explanation that gp160 is
transported by the fusion protein to lysosomes and degraded.

Cathepsin D antibody immunoprecipitates show similar results by
SDS gel electrophoresis. At the end of a pulse or after a 2 h chase, a
major band near 97 kD is clearly the fusion protein sCD4-PCaD. The
molecular size of this band agrees with the fusion between sCD4
(apparent size: 46 kD) and procathepsin D (apparent size: 50 kD). With
a 4 h chase, cathepsin D appeared mainly as a 35 kD band. This is
close to the size of the human cathepsin D heavy chain, usually in the
range of 30 to 35 kD. (The 0.9 kD light chain is usually too faint to be
seen.) It is known that procathepsin is rapidly activated to cathepsin D
and slowly processed to a 2-chain enzyme in the lysosomes, as reported
by Hasilik, A. and Neufeld, E.F., J. Biol. Chem. 255: 4937-4945 (1980)
and Erickson, A.H., et al., J. Biol. Chem. 256: 11224-11231 (1981).
Thus, the time for the fusion protein to reach lysosomes is between 2 and
4 h. In summary, the sCD4-PCaD fusion protein is targeted to the
lysosomes, activated, and correctly processed.

The electrophoretic patterns of CD4 antibody immunoprecipitates
at different chase times also show similar results. At the end of pulse
labelling of 20 min (0 chase time), gp160 transfection alone provides
only faint background bands, as no CD4 synthesis is expected in HeLa
cells. Cotransfection with sCD4-PCaD produces a major band near 97
kD. This is the same position as the synthetic band detected with
cathepsin D antibody. Even as early as the end of the 20 min pulse
period, some degradation bands of sCD4 could be seen, with the 46 kD
band appearing at the same position as the authentic sCD4; so this band
must have come from the activation of the procathepsin D moiety of the
fusion protein in the lysosomes, which resulted in the separation of sCD4
from PCaD. The major band for the cotransfection of sCD4-HAP at 0
time chase is about 50 kD. Since the lysosome targeting domain from
HAP is only 45 amino acid residues, this is the expected size for the
fusion protein sCD4-HAP.
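
The size arguments in the two preceding paragraphs are simple additions, and a short Python sketch makes them explicit. The apparent sizes of sCD4 (46 kD) and procathepsin D (50 kD) and the 45-residue length of the HAP targeting domain come from the text; the figure of roughly 110 Da per amino acid residue is an assumption (a common rule of thumb), and the names used are illustrative.

```python
# Values stated in the text.
SCD4_KD = 46.0                # apparent size of sCD4
PCAD_KD = 50.0                # apparent size of procathepsin D
HAP_TARGETING_RESIDUES = 45   # length of the HAP lysosome targeting domain

# Assumption, not from the source: average mass per residue of about 110 Da.
AVG_RESIDUE_KD = 0.110


def fusion_size_kd(*parts_kd):
    """Estimate a fusion protein's size as the sum of its parts (in kD)."""
    return sum(parts_kd)


scd4_pcad_kd = fusion_size_kd(SCD4_KD, PCAD_KD)
scd4_hap_kd = fusion_size_kd(SCD4_KD, HAP_TARGETING_RESIDUES * AVG_RESIDUE_KD)

print(f"sCD4-PCaD estimate: {scd4_pcad_kd:.1f} kD (band observed near 97 kD)")
print(f"sCD4-HAP estimate:  {scd4_hap_kd:.1f} kD (band observed at about 50 kD)")
```

Both estimates fall within about 1 kD of the reported bands, which is well inside the resolution expected of SDS-gel size estimates.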

After a 2 h chase, much of the labelled sCD4 domain in sCD4-PCaD
had been degraded to 3 bands in the size range of 25 to 40 kD.
The sCD4-HAP band had become a doublet due to the degradation of the
targeting domain from the C-terminal of sCD4, which is known to occur
for all the lysosomal membrane proteins. After 4 h and 16 h chase
periods, diminishing amounts of material recognizable by CD4 antibody
were present due to further lysosomal proteolysis.

In conclusion, some sCD4-PCaD fusion protein reaches lysosomes
as early as 20 min. At the end of 2 h, the degradation of sCD4-fusion
proteins is extensive. At 16 h, the degradation is nearly complete. The
degradation kinetics of the sCD4 moiety is similar to that of gp160, which
was nearly completed after 18 h.

WO 93/06216                                        PCT/US92/08090

III. T-lymphocyte CEM cells transfected with therapeutic genes
can resist HIV propagation.

CEM (CD4+) cells are T-lymphoma cells which have CD4 on
their cell surface. These cells are susceptible to HIV infection. The
cells were transfected with therapeutic genes which transiently express
therapeutic proteins. The transfected cells, control cells transfected with
blank vectors, and untransfected cells were challenged with HIV. HIV
propagation was determined by the analysis of reverse transcriptase
activity.

Figure 6 shows reverse transcriptase (RT) activities at
different times in control cells (electroporation without DNA), in cells
transfected with blank vector, and in cells separately transfected with
vectors carrying one of four therapeutic genes (sCD4-PCaD, sCD4-HAP,
sCD4-L1 and sCD4-L2). Transfections were done 40 h before the addition
of HIV (day 0). Up to day 3, no difference could be seen among the cell
groups. However, at day 7, the control cells and the cells transfected
with blank vector both showed a dramatic increase in RT activity. The
agreement between these two controls is reasonably good. In the cells
transfected with sCD4-fusion genes, however, the increases are only about
10-15% of the controls.

The transient expression of the fusion genes usually peaks around
two to three days and lasts for several days. However, the differences
in RT activity became apparent at day 7. This means that from day 0 to
day 3 the sCD4-fusion genes are depriving the HIV virions of gp160
(gp120); thus, the deficiency of gp160 on the virions was manifested as
inhibition of propagation at day 7. These data support the view that the
fusion genes work as intended in CEM cells challenged with HIV.

The results lead to the following conclusions:

Therapeutic genes can reverse syncytium formation. Thus, the
therapeutic genes prevent gp160 from reaching the cell surface. When the
sCD4-fusion genes are expressed, gp160 and sCD4 are degraded rapidly.
All evidence shows the main site of degradation is the lysosomes. When
sCD4-PCaD is expressed, the procathepsin D moiety is activated and
correctly processed, which is exclusively a lysosomal phenomenon.
Thus, sCD4-PCaD protein enters lysosomes. In summary, it can be
concluded that the therapeutic genes transport gp160 to lysosomes.

Therapeutic genes sCD4-PCaD, sCD4-HAP, sCD4-L1, and sCD4-L2
inhibit HIV propagation in T-lymphocyte cell line CEM cells.

Example 3: Expression of sCD4-PCaD gene causes gp160 to be
present in the lysosomes.

To further demonstrate that sCD4-fusion proteins are degraded in the
lysosomes and that gp160 is transported to the lysosomes by each of the
4 sCD4 fusion proteins, lysosomes were fractionated by Percoll density
gradient centrifugation from the 35S-methionine labelled cells which had
been either transfected with gp160 or cotransfected with gp160 and
sCD4-PCaD genes. HeLa cells were transfected with gp160 gene or
cotransfected with gp160 + sCD4-PCaD genes, labelled with
35S-methionine, and chased for 4 h as described in the pulse-chase
experiments. Cells were scraped from plates, homogenized, and
centrifuged to obtain the postnuclear supernate as described by
Gieselmann, J. Cell Biol. 97: 1-5 (1983). This supernate (1 ml) was
applied as a top layer on a 15% Percoll (Pharmacia) solution in 0.25 M
sucrose in a Beckman polycarbonate centrifuge bottle (No. 355603) and
centrifuged in a Beckman 80Ti rotor at 33,000 x g for 30 min. Fractions
of 0.75 ml each, collected starting from the bottom, were
analyzed for β-hexosaminidase activity (Geiger and Arnon, Methods in
Enzymology 50, 547-555 (1978)) and were subjected to
immunoprecipitation using anti-gp160 antiserum. Electrophoresis of the
precipitates and autoradiography were performed as described above.
Figure 7A shows the β-hexosaminidase activity (solid line), density
(broken line), and the autoradiograph of the gel electrophoresis from
cells transfected with gp160 alone. Figure 7B shows the same data from
cells cotransfected with gp160 and sCD4-PCaD genes.

The fractions were analyzed for lysosomes by β-hexosaminidase
activity and also immunoprecipitated with anti-gp160 antiserum,
electrophoresed, and visualized by autoradiography. Figures 7A and B
show that the activity of hexosaminidase accumulates in fractions 1-3,
which represent dense lysosomes, and is more pronounced in fractions
10-13, which represent the light lysosomes and endosomes. The total
β-hexosaminidase activities are similar in the two transfections
(Figures 7A and B), meaning that the number of lysosomes is about the
same in them. However, gp160 bands are present only in the cells
cotransfected with sCD4-PCaD gene, not in the cells transfected with
gp160 alone. These observations support the view that the expression of
sCD4-PCaD gene diverts gp160 to lysosomes.

Example 4: Demonstration that the fusion gene can encode
portions of the binding protein and be efficacious.

D1 and D1-D2 were tested as alternatives to the sCD4 domain in
the sCD4-fusion genes. The results from syncytium formation and
pulse-chase experiments indicated that sCD4 in the fusion gene can be
substituted by its parts, domain D1 or combined domains D1-D2.

D1-HAP and D1-D2-HAP were tested for the ability to reverse the
syncytium formation caused by the transfection of gp160 into HeLa-CD4+
cells. The experimental conditions were the same as already described
for the other syncytium experiments. It was observed that both fusion
genes, when cotransfected with the gp160 gene, prevented the syncytium
formation.

Pulse-chase studies using the fusion genes were also conducted in
HeLa cells. Antiserum against gp160 was used for immunoprecipitation. It
was observed that D1-HAP and D1-D2-HAP coprecipitated with gp160.

These observations suggest that D1 and D1-D2 fusion genes work
as effectively as the sCD4 fusion genes.

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(F) TISSUE TYPE: Epithelial

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2460..2465
(D) OTHER INFORMATION: /note= "Restriction site"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT 60
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC 120
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG 180
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA 240
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT 300
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC 360
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC 420
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC 480
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC 540
AATGTAGGAG TCCAAGGGGT AAAAAGATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC 600
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG 660
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCCAGCATA GTCTATAAGA 720
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG 780
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT 840
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA 900
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT 960
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGCATCAG GAAGTGAACC 1020
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA 1080
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC 1140
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG 1200
ACTCGGGAGA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG 1260
TGCAGCCAAT GGCCTCTAGA CTCGTCAGGA TCCCGCTGCA CAAGTTCACG TCCATCCGCC 1320
GGACCATGTC GGAGGTTGGG GGCTCTGTGG AGGACCTGAT TGCCAAAGGC CCCGTCTCAA 1380
AGTACTCCCA GGCGGTGCCA GCCGTGACCG AGGGGCCCAT TCCCGAGGTG CTCAAGAACT 1440
ACATGGACGC CCAGTACTAC GGGGAGATTG GCATCGGGAC GCCCCCCCAG TGCTTCACAG 1500
TCGTCTTCGA GACGGGCTCC TCCAACCTGT GGGTCCCCTC CATCCACTGC AAACTGCTGG 1560
ACATCGCTTG CTGGATCCAC CACAAGTACA ACAGCGACAA GTCCAGCACC TACGTGAAGA 1620
ATGGTACCTC GTTTGAGATC CACTATGGCT CGGGCAGCCT CTCCGGGTAC CTGAGCCAGG 1680
ACACTGTGTC GGTGCCCTGC CAGTGAGCGT CGTCAGCCTC TGCCCTGGGC GGTGTCAAAG 1740
TGGAGAGGCA GGTCTTTGGG GAGGCCACCA AGCAGCCAGG CATCACCTTC ATCGCAGCCA 1800
AGTTCGATGG CATCCTGGGC ATGGCCTACC CCCGCATCTC CGTCAACAAC GTGCTGCCCG 1860
TCTTCGACAA CCTGATGCAG CAGAAGCTGG TGGACCAGAA CATCTTCTCC TTCTACCTGA 1920
GCAGGGACCC AGATGCGCAG CCTGGGGGTG AGCTGATGCT GGGTGGCACA GACTCCAAGT 1980
ATTACAAGGG TTCTCTGTCC TACCTGAATG TCACCCGCAA GGCCTACTGG CAGGTCCACC 2040
TGGACCAGGT GGAGGTGGCC AGCGGGCTGA CCCTGTGCAA GGAGGGCTGT GAGGCCATTG 2100
TGGACACAGG CACTTCCCTC ATGGTGGGCC CGGTGGATGA GGTGCGCGAG CTGCAGAAGG 2160
CCATGGGGGC CGTGCCGCTG ATTCAGGGCG AGTAGATGAT CCCCTGTGAG AAGGTGTCCA 2220
CCCTGCCCGC GATCACACTG AAGCTGGGAG GCAAAGGCTA GAAGCTGTCC CCAGAGGACT 2280
ACACGCTCAA GGTGTCGCAG GCCGGGAAGA CCCTCTGCCT GAGCGGCTTC ATGGGCATGG 2340
ACATCCCGCC ACCCAGCGGG CCACTCTGGA TCCTGGGCGA CGTCTTCATC GGCCGCTACT 2400
ACACTGTGTT TGACCGTGAC AACAACACGG TGGGCTTCGC CGAGGCTGCC CGCCTCTAGC 2460
AGCTG 2465



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1448 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(F) TISSUE TYPE: Epithelial

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1275..1280
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1444..1448
(D) OTHER INFORMATION: /note= "Restriction site"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT 60
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC 120
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG 180
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA 240
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT 300
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC 360
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC 420
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC 480
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC 540
AATGTAGGAG TCCAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC 600
TGGAGCTGCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG 660
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCGAGCATA GTCTATAAGA 720
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGGTGACGG 780
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT 840
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA 900
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT 960
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAAGAGGAAA GTTGCATCAG GAAGTGAACC 1020
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA 1080
CCTCCCCTAA GCTGATGCTG AGGTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC 1140
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG 1200
ACTCGGGAGA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG 1260
TGCAGCCAAT GGCCTCTAGA CAGCTGGCAA GCGGTCCTGC AGACACAGAG GTGATTGTGG 1320
CCTTGGCTGT ATGTGGCTCC ATCCTCTTCC TCCTCATAGT GCTGCTCCTC ACCGTCCTCT 1380
TCGGGATGCA GGCCCAGCCT CCTGGCTACC GCCACGTCGC AGATGGGGAG GACCACGCGT 1440
GAGTCGAC 1448

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(F) TISSUE TYPE: Epithelial

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1275..1280
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1416..1421
(D) OTHER INFORMATION: /note= "Restriction site"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT 60
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC 120
TGGAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG 180
GGGATACAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA 240
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT 300
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC 360
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC 420
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC 480
TTCAGGGGCA GAGCCTGACC CTGACCTTGG AGAGCCCCCC TGGTAGTAGC CCCTCAGTGC 540
AATGTAGGAG TCGAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC 600
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG 660
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCCAGAAGGC CTCCAGCATA GTCTATAAGA 720
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG 780
GCAGTGGCGA GCTGTGGTGG GAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TGGATCACCT 840
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA 900
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT 960
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGCATCAG GAAGTGAACC 1020
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA 1080
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC 1140
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG 1200
ACTCGGGACA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG 1260
TGCAGCCAAT GGCCTCTAGA CTGCTGGACG AGAACAGCAC GCTGATCCCC ATCGCTGTGG 1320
GTGGTGCCCT GGCGGGGCTG GTCCTCATCG TCCTCATCGC CTACCTCGTC GGCAGGAAGA 1380
GGAGTCACGC AGGCTACCAG ACTATCTAGC CTGGTGTCGA C 1421

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(F) TISSUE TYPE: Epithelial

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1275..1280
(D) OTHER INFORMATION: /note= "Restriction site"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1410..1415
(D) OTHER INFORMATION: /note= "Restriction site"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GAATTCAAGC CCAGAGCCCT GCCATTTCTG TGGGCTCAGG TCCCTACTGC TCAGCCCCTT 60
CCTCCCTCGG CAAGGCCACA ATGAACCGGG GAGTCCCTTT TAGGCACTTG CTTCTGGTGC 120
TGCAACTGGC GCTCCTCCCA GCAGCCACTC AGGGAAAGAA AGTGGTGCTG GGCAAAAAAG 180
GGGATAGAGT GGAACTGACC TGTACAGCTT CCCAGAAGAA GAGCATACAA TTCCACTGGA 240
AAAACTCCAA CCAGATAAAG ATTCTGGGAA ATCAGGGCTC CTTCTTAACT AAAGGTCCAT 300
CCAAGCTGAA TGATCGCGCT GACTCAAGAA GAAGCCTTTG GGACCAAGGA AACTTCCCCC 360
TGATCATCAA GAATCTTAAG ATAGAAGACT CAGATACTTA CATCTGTGAA GTGGAGGACC 420
AGAAGGAGGA GGTGCAATTG CTAGTGTTCG GATTGACTGC CAACTCTGAC ACCCACCTGC 480
TTCAGGGGCA GAGCCTGACC CTGACCTTCG AGAGGCCCCC TGGTAGTAGC CCCTCAGTGC 540
AATGTAGGAG TCCAAGGGGT AAAAACATAC AGGGGGGGAA GACCCTCTCC GTGTCTCAGC 600
TGGAGCTCCA GGATAGTGGC ACCTGGACAT GCACTGTCTT GCAGAACCAG AAGAAGGTGG 660
AGTTCAAAAT AGACATCGTG GTGCTAGCTT TCGAGAAGGC CTCCAGCATA GTCTATAAGA 720
AAGAGGGGGA ACAGGTGGAG TTCTCCTTCC CACTCGCCTT TACAGTTGAA AAGCTGACGG 780
GCAGTGGCGA GCTGTGGTGG CAGGCGGAGA GGGCTTCCTC CTCCAAGTCT TCGATCACCT 840
TTGACCTGAA GAACAAGGAA GTGTCTGTAA AACGGGTTAC CCAGGACCCT AAGCTCCAGA 900
TGGGCAAGAA GCTCCCGCTC CACCTCACCC TGCCCCAGGC CTTGCCTCAG TATGCTGGCT 960
CTGGAAACCT CACCCTGGCC CTTGAAGCGA AAACAGGAAA GTTGGATCAG GAAGTGAACC 1020
TGGTGGTGAT GAGAGCCACT CAGCTCCAGA AAAATTTGAC CTGTGAGGTG TGGGGACCCA 1080
CCTCCCCTAA GCTGATGCTG AGCTTGAAAC TGGAGAACAA GGAGGCAAAG GTCTCGAAGC 1140
GGGAGAAGGC GGTGTGGGTG CTGAACCCTG AGGCGGGGAT GTGGCAGTGT CTGCTGAGTG 1200
ACTCGGGACA GGTCCTGCTG GAATCCAACA TCAAGGTTCT GCCCACATGG TCCACCCCGG 1260
TGCAGCCAAT GGCCTCTAGA AGTGGAGATG ACGACAACTT CCTTGTGCCC ATAGCGGTGG 1320
GAGCTGCCTT GGCAGGAGTA CTTATTCTAG TGTTGCTGGC TTATTTTATT GGTCTCAAGC 1380
ACCATCATGC TGGATATGAG CAATTTTAGG TCGAC 1415

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "ER retention signal"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Ser Glu Lys Asp Glu Leu
1               5

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..23
(D) OTHER INFORMATION: /note= "PCR Primer P-1 used in
construction of sCD4-PCaD"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GAATTCAAGC CCAGAGCCCT GCC



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..23
(D) OTHER INFORMATION: /note= "PCR Primer P-2 used in
construction of sCD4-PCaD"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TCTAGAGGCC ATTGGCTGCA CCG



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..23
(D) OTHER INFORMATION: /note= "PCR Primer P-3 used in
construction of sCD4-PCaD"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

TCTAGACTCG TCAGGATCCC GCTG



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..22
(D) OTHER INFORMATION: /note= "PCR Primer P-4 used in
construction of sCD4-PCaD"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

GTCGACCTAG AGGCGGGCAG CC



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..25
(D) OTHER INFORMATION: /note= "PCR Primer P-1 used in
construction of sCD4-HAP"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

TCTAGACAGC TGGGAAGCGG TCCTG



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..24
(D) OTHER INFORMATION: /note= "PCR Primer P-2 used in
construction of sCD4-HAP"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

CTCGACTCAG GCGTGGTCCT CCCC



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..26
(D) OTHER INFORMATION: /note= "PCR Primer P-1 used in
construction of sCD4-L1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

TCTAGACTGC TGGACGAGAA CAGCAC



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /note= "PCR Primer P-2 used in
construction of sCD4-L1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

GTCGACACCA GGCTAGATAG TCTGGTAG



(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /note= "PCR Primer P-1 used in
construction of sCD4-L2"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

TCTAGAAGTG CAGATGACGA CAACTTC



(2) INFORMATION FOR SEQ TlX N0:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME /KEY: misc_feature 

(B) LOCATION: 1..30 

(D) OTHER INFORMATION: /note= "PCR Primer P-2 used in 
construction of SCD4-L2" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GTCGACCTAA AATTGCTCAT ATCCAGCATG



WO 93/06216

PCT/US92/08090



We claim: 

1. A fusion protein comprising:

a protein which binds to a retroviral envelope protein and
a protein or domain of the protein which targets the fusion
protein to a lysosome by causing the protein to be delivered to a
lysosome when expressed within a mammalian cell.

2. The protein of claim 1 wherein the binding protein is soluble
CD4.

3. The protein of claim 1 wherein the binding protein is selected
from the group consisting of sCD4 domain D1 and sCD4 combined
domains D1-D2.

4. The protein of claim 1 wherein the targeting protein is selected
from the group consisting of procathepsin D (PCaD), human lysosomal
membrane protein lamp-1, human lysosomal membrane protein lamp-2,
acid phosphatase, and portions thereof targeting the protein to a
lysosome.

5. The protein of claim 1 wherein the targeting protein is a 
lysosomal proenzyme which contains a structural marker for lysosomal 
targeting. 

6. The protein of claim 5 selected from the group consisting of
pro-cathepsin D, α-N-Acetylgalactosaminidase, Glycosylasparaginase,
Glucocerebrosidase, Procathepsin L, Procathepsin B, and Procathepsin E.

7. The protein of claim 1 wherein the targeting protein is taken
from part of the lysosomal membrane proteins, lamp-1 (L1), lamp-2
(L2), and lysosomal acid phosphatase (HAP).

8. The protein of claim 1 wherein there are multiple targeting 
proteins. 

9. The protein of claim 8 wherein the fusion protein is selected
from the group consisting of sCD4-PCaD-HAP and sCD4-PCaD-PCaD.

10. The protein of claim 1 wherein the targeting protein and the
binding protein are separated by a peptide sequence.

11. A gene encoding a fusion protein comprising: 

a protein which binds to a retroviral envelope protein and 
a protein or domain of the protein which targets the fusion 
protein to a lysosome by causing the protein to be delivered to a
lysosome when expressed within a mammalian cell.

12. The gene of claim 11 encoding a protein selected from the
group consisting of sCD4-PCaD, sCD4-HAP, sCD4-L1, and sCD4-L2.

13. The gene of claim 11 encoding a fusion protein wherein the
binding protein is soluble CD4.

14. The gene of claim 11 encoding a fusion protein wherein the
binding protein is selected from the group consisting of sCD4 domain D1
and sCD4 combined domains D1-D2.

15. The gene of claim 11 encoding a fusion protein wherein the
targeting protein is selected from the group consisting of procathepsin D
(PCaD), human lysosomal membrane protein lamp-1, human lysosomal
membrane protein lamp-2, acid phosphatase, and portions thereof
targeting the protein to a lysosome.

16. The gene of claim 11 encoding a fusion protein wherein the
targeting protein is a lysosomal proenzyme which contains a structural
marker for lysosomal targeting.

17. The gene of claim 11 encoding a fusion protein selected from
the group consisting of pro-cathepsin D, α-N-Acetylgalactosaminidase,
Glycosylasparaginase, Glucocerebrosidase, Procathepsin L, Procathepsin
B, and Procathepsin E.

18. The gene of claim 10 encoding a fusion protein wherein the
targeting protein is taken from part of the lysosomal membrane proteins,
lamp-1 (L1), lamp-2 (L2), and lysosomal acid phosphatase (HAP).

19. The gene of claim 10 encoding a fusion protein wherein there
are multiple targeting proteins.

20. The gene of claim 10 encoding a fusion protein wherein the
fusion protein is selected from the group consisting of sCD4-PCaD-HAP
and sCD4-PCaD-PCaD.

21. The gene of claim 11 further comprising a vector.

22. A method for treating a viral disease comprising introducing
into cells that are infected or exposed to a retrovirus a gene encoding a
fusion protein comprising:

a protein which binds to a retroviral envelope protein and
a protein or domain of the protein which targets the fusion
protein to a lysosome by causing the protein to be delivered to a
lysosome when expressed within a mammalian cell.

23. The method of claim 22 wherein the virus is human 
immunodeficiency virus. 

24. The method of claim 22 wherein the gene is introduced into
the cells within a viral vector.

25. The method of claim 22 wherein the gene encodes a protein
selected from the group consisting of sCD4-PCaD, sCD4-HAP, sCD4-L1,
and sCD4-L2.

26. The method of claim 22 wherein the gene encodes a fusion
protein wherein the binding protein is selected from the group consisting
of sCD4, sCD4 domain D1 and sCD4 combined domains D1-D2.

27. The method of claim 22 wherein the gene encodes a fusion
protein wherein the targeting protein is selected from the group consisting
of procathepsin D (PCaD), human lysosomal membrane protein lamp-1,
human lysosomal membrane protein lamp-2, acid phosphatase,
pro-cathepsin D, α-N-Acetylgalactosaminidase, Glycosylasparaginase,
Glucocerebrosidase, Procathepsin L, Procathepsin B, and Procathepsin E,
and portions thereof targeting the protein to a lysosome.

28. The method of claim 22 wherein the gene encodes a fusion
protein including multiple targeting proteins.